Performance Scalability of Multimedia Instruction Set Extensions
Abstract
Current media ISA extensions such as Sun’s VIS consist of SIMD-like instructions that operate on short vector registers. To exploit more parallelism in a superscalar processor provided with such instructions, the issue width has to be increased. In the Complex Streamed Instruction (CSI) set, exploiting more parallelism does not involve issuing more instructions. In this paper we study how the performance of superscalar processors extended with CSI or VIS scales with the amount of parallel execution hardware. Results show that the performance of the CSI-enhanced processor scales very well. For example, increasing the datapath width of the CSI execution unit from 16 to 32 bytes improves the kernel-level performance by a factor of 1.56 on average. The VIS-enhanced machine is unable to utilize large amounts of parallel execution hardware efficiently: because of the huge number of instructions that need to be executed, the decode-issue logic becomes a bottleneck.
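The contrast drawn in the abstract can be illustrated with a rough back-of-the-envelope sketch. It is not the paper's simulation model: the function names, the 1024-byte stream, the 8-byte register width, and the idealized "one datapath-width of data per cycle" assumption are all hypothetical and chosen only for illustration. The one figure taken from the abstract is the 1.56 speedup for doubling the CSI datapath from 16 to 32 bytes.

```python
import math


def simd_instruction_count(stream_bytes: int, register_bytes: int = 8) -> int:
    """Instructions needed for one operation over a stream when each
    SIMD-like instruction processes a single short register
    (register_bytes is an illustrative assumption)."""
    return math.ceil(stream_bytes / register_bytes)


def streamed_cycles(stream_bytes: int, datapath_bytes: int) -> int:
    """Cycles taken by ONE streamed instruction under an idealized model in
    which the execution unit consumes datapath_bytes per cycle (assumption)."""
    return math.ceil(stream_bytes / datapath_bytes)


def scaling_efficiency(speedup: float, resource_ratio: float) -> float:
    """Fraction of the ideal (linear) speedup that was actually achieved."""
    return speedup / resource_ratio


stream = 1024  # hypothetical stream length in bytes

# Short-vector SIMD: instruction count grows with the stream length,
# so more parallelism means more instructions to decode and issue.
print(simd_instruction_count(stream))

# Streamed instruction: still one instruction; widening the datapath
# reduces the (idealized) cycle count without extra issue bandwidth.
print(streamed_cycles(stream, 16), streamed_cycles(stream, 32))

# Reported average kernel-level speedup of 1.56 for a 2x wider datapath.
print(scaling_efficiency(1.56, 2.0))
```

Under these assumptions, the short-vector approach needs one instruction per register-sized chunk, whereas the streamed approach keeps the instruction count constant and scales only the execution hardware. The reported factor of 1.56 corresponds to roughly 78% of ideal linear scaling.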
Similar resources
PLX: An Instruction Set Architecture and Testbed for Multimedia Information Processing
PLX is a concise instruction set architecture (ISA) that combines the most useful features from previous generations of multimedia instruction sets with newer ISA features for high-performance, low-cost multimedia information processing. Unlike previous multimedia instruction sets, PLX is not added onto a base processor ISA, but designed from the beginning as a standalone processor architecture...
Refining Instruction Set Architecture for High-Performance Multimedia Processing in Constrained Environments
Multimedia processing in software has been significantly accelerated by the addition of subword-parallel instructions to the instruction set architectures (ISAs) of modern microprocessors. While some of these multimedia instructions are simple and effective, others are very complex, requiring large, special-purpose functional units that are not practical for constrained environments such as han...
PLX: a fully subword-parallel instruction set architecture for fast scalable multimedia processing
PLX is a small, fully subword-parallel instruction set architecture designed for very fast multimedia processing, especially in constrained environments requiring low cost and power such as handheld multimedia information appliances. In PLX, we select the most useful multimedia instructions added previously to microprocessors. We also introduce a few novel features: a new definition of predicat...
Performance Benefits of Special-Purpose Instructions in the CSI Architecture
The Complex Streamed Instruction Set (CSI) architecture was proposed in order to overcome the limitations of existing multimedia-oriented ISA extensions, such as Intel’s MMX and SSE. One of the main limitations is the large number of instructions that have to be executed. In CSI, instructions operate on data streams of arbitrary length, which allows to dramatically reduce the instruction counts ...
Impact of Multimedia Extensions for Different Processing Element Granularities on an Embedded Imaging System
Multimedia applications are among the most dominant computing workloads driving innovations in high performance and cost effective systems. In this regard, modern general-purpose microprocessors have included multimedia extensions (e.g., MMX, SSE, VIS, MAX, ALTIVEC) to their instruction set architectures to improve the performance of multimedia with little added cost to microprocessors. Whereas...